Abstract

Several papers describe the use of bootstrap techniques in point process
statistics. The aim of the present paper is to show that the form in which the
bootstrap is used there is dubious. For variance estimation of pair correlation
function estimators, the bootstrap techniques used lead to results that can be
obtained more simply without simulation; furthermore, these results differ from
the desired ones. The problem of obtaining confidence regions for the intensity
function of an inhomogeneous Poisson process can easily be solved without
bootstrap techniques.

1 Introduction

The bootstrap has become a popular tool in many branches of statistics, including
the statistics of stochastic processes. Thus it is natural to ask whether bootstrap
techniques could also be helpful in point process statistics. Indeed, some authors
have developed statistical procedures using bootstrap techniques, see e.g. [?], [?]
and [?]. All these papers deal with estimating the accuracy of estimators of point
process characteristics. In the first three papers, estimation of the variance of
pair correlation function estimators is treated. The last one presents a procedure
to determine confidence regions for the intensity function of an inhomogeneous
Poisson process.

The fundamental idea of the bootstrap, to resample the given data in order to
obtain 'new' pseudo data, also appears in the statistics of stochastic processes,
in particular in the analysis of time series, see e.g. [?]. In some variants of the
method, called the blockwise bootstrap, the time series is partitioned into several
parts, which are then resampled. A similar idea is applied in [?] to the statistical
analysis of a planar random set. Clearly, the partitioning procedure can also be
adapted to point process statistics. However, partitioning can destroy point
structures or add new artificial structures to the point pattern. In the
one-dimensional case the error resulting from this loss of information may still be
acceptable, but in higher dimensions it will be serious. Thus in spatial point
process statistics another method is used, which is quite similar to the application
of the bootstrap in classical statistics: the points of the process (including their
locations, which are assumed to be pairwise different) are resampled. The pseudo
pattern then consists of $n$ points $x_1^*, \ldots, x_n^*$, which are obtained by
sampling randomly with replacement $n$ times from the original data
$\{x_1, \ldots, x_n\}$.

Naturally, the pseudo patterns generated by this method almost always contain
multiple points: for $n$ points, a resample is free of duplicates only with
probability $n!/n^n$. Thus they have a character different from that of the
original pattern, which has no multiple points.

Consequently, it would be surprising if quantities computed from such pseudo
patterns produced good estimators for quantities of the original point process.
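
To indicate how rarely a pseudo pattern is free of multiple points, the following
minimal sketch evaluates the probability $n!/n^n$ that $n$ draws with replacement
from $n$ distinct points are all different. The function name is an illustrative
choice and does not come from the papers discussed.

```python
from fractions import Fraction
from math import factorial


def prob_no_multiple_points(n: int) -> Fraction:
    """Probability that n draws with replacement from n distinct points
    hit n different points, i.e. n! / n**n."""
    return Fraction(factorial(n), n ** n)


if __name__ == "__main__":
    # The probability decays quickly with the number of points.
    for n in (2, 5, 10, 20):
        print(n, float(prob_no_multiple_points(n)))
```

Already for ten points fewer than one resample in a thousand is free of multiple
points.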

This paper analyses the pointwise resampling technique for some examples of
point process statistics. Section 2 discusses the main ideas of the paper [?], which
presents a procedure for estimating the standard error of an estimator of the pair
correlation function. In Section 3 a method (drawn from [?]) to determine
confidence intervals for the intensity function of an inhomogeneous Poisson process
is considered. Finally, an easier method is presented which yields confidence
regions for the intensity function of an inhomogeneous Poisson process without
the bootstrap.



2 Variance of estimators of the product density 

This section discusses the main ideas of [?], where bootstrap techniques are used
to approximate the standard error of a pair correlation function estimator. The
calculations are presented in an abridged form; the complete calculations are given
in the Appendix.



2.1 Fundamentals 

Let $\Phi$ be a stationary and isotropic point process; see, for example, [?] for
definitions. A standard second-order characteristic of $\Phi$ is the product density
function $\varrho^{(2)}(r)$. This function can be interpreted heuristically as
follows. If $B_1$ and $B_2$ are two infinitesimally small disjoint Borel sets of
volumes $dV_1$ and $dV_2$, and if $x_1 \in B_1$ and $x_2 \in B_2$ are points of
distance $\|x_1 - x_2\| = r$, then $\varrho^{(2)}(r)\,dV_1\,dV_2$ is the probability
that $\Phi$ has a point in each of $B_1$ and $B_2$. A simple estimator of
$\varrho^{(2)}$ without any border correction is given by

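$$\hat\varrho^{(2)}(r) = \frac{1}{2\pi r\,\nu(W)} \sum^{\neq}_{x,y \in \Phi \cap W} K(r - \|x - y\|).$$

The summation goes over all point pairs with different members, $W$ denotes the
window of observation, $\nu(W)$ its area, and $K$ is a kernel function.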

This situation can be generalized to the case of any 'two-point estimator'

$$\hat\theta = \sum^{\neq}_{x,y \in \Phi} f(x,y)$$

with $f$ being symmetric in its arguments and of the form

$$f(x,y) = 1_W(x)\,1_W(y)\,h(x,y)$$

with some function $h$. As the special form of $f$ leading to $\hat\varrho^{(2)}(r)$
is unimportant, the following calculations are carried out for a general
$\hat\theta$.

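The quantity of interest is the variance of $\hat\theta$, which is given by

$$\mathrm{V}\hat\theta = \mathrm{E}\hat\theta^2 - (\mathrm{E}\hat\theta)^2 = s_4 + 4 s_3 + 2 s_2 - (\mathrm{E}\hat\theta)^2 \qquad (2)$$

with

$$s_i = \int \varrho^{(i)}(x_1, \ldots, x_i)\, f(x_1, x_2)\, f(x_{i-1}, x_i)\, dx_1 \ldots dx_i, \qquad i = 2, 3, 4,$$

where $\varrho^{(i)}$ is the $i$th-order product density function of $\Phi$, see the
Appendix.

2.2 Bootstrap version of $\hat\theta$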

Assume that a sample of $\Phi$ is given which consists of $n$ points
$x_1, \ldots, x_n$ in the observation window $W$. It is resampled $N$ times to obtain
$N$ 'new' point patterns. Each pseudo pattern consists of $n$ points
$x_1^*, \ldots, x_n^*$, which are obtained by sampling randomly with replacement $n$
times from $\{x_1, \ldots, x_n\}$. Thus in the pseudo samples some points of the
original point pattern do not occur, while others occur twice or even more often.
Let the number of occurrences of $x_i$ in the $k$th sample be $w_k(i)$. Then the
$k$th sample can be represented by the vector $w_k = (w_k(1), \ldots, w_k(n))$,
which has a multinomial distribution. This distribution depends only on $n$. In the
limiting case $n \to \infty$ the components $w_k(i)$ of $w_k$ are independent and
Poisson distributed with mean $\mu = 1$.

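The bootstrap estimate for the $k$th pseudo sample is

$$\hat\theta_k^* = \sum^{\neq}_{i,j} f(x_i, x_j)\, w_k(i)\, w_k(j), \qquad k = 1, \ldots, N,$$

where the summation goes over all pairs $i, j \in \{1, \ldots, n\}$ with $i \neq j$.
The variance of $\hat\theta$ is estimated by the usual variance estimator
corresponding to the $\hat\theta_k^*$'s,

$$v_N^* = \frac{1}{N-1} \sum_{k=1}^{N} \left( \hat\theta_k^* - \frac{1}{N} \sum_{l=1}^{N} \hat\theta_l^* \right)^2.$$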
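
The resampling scheme just described can be sketched in a few lines. This is only an
illustration under stated assumptions: the names `theta_star` and
`bootstrap_variance`, the use of Python's `random.Random`, and the example function
`f` are hypothetical choices and are not taken from the papers discussed.

```python
import random


def theta_star(points, f, counts):
    """Bootstrap estimate: sum over i != j of f(x_i, x_j) * w(i) * w(j)."""
    n = len(points)
    return sum(f(points[i], points[j]) * counts[i] * counts[j]
               for i in range(n) for j in range(n) if i != j)


def bootstrap_variance(points, f, num_resamples, rng):
    """Usual sample variance of num_resamples bootstrap estimates."""
    n = len(points)
    values = []
    for _sample in range(num_resamples):
        # Draw n times with replacement and record the multiplicities w(i).
        counts = [0] * n
        for _draw in range(n):
            counts[rng.randrange(n)] += 1
        values.append(theta_star(points, f, counts))
    mean = sum(values) / num_resamples
    return sum((v - mean) ** 2 for v in values) / (num_resamples - 1)


if __name__ == "__main__":
    # Illustrative one-dimensional pattern and a hypothetical choice of f.
    pattern = [0.05, 0.21, 0.33, 0.58, 0.74, 0.90]
    close_pair = lambda x, y: 1.0 if abs(x - y) < 0.2 else 0.0
    print(bootstrap_variance(pattern, close_pair, 1000, random.Random(1)))
```

With all multiplicities equal to one, `theta_star` reduces to the original
two-point estimator evaluated on the sample.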
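
Since the $\hat\theta_k^*$ are (conditionally on $x_1, \ldots, x_n$) independent and
identically distributed, it is

$$\begin{aligned}
\lim_{N\to\infty} v_N^* &= \mathrm{V}\hat\theta_1^* = \mathrm{E}(\hat\theta_1^*)^2 - (\mathrm{E}\hat\theta_1^*)^2 \\
&= \alpha_4 \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l)
 + 4\alpha_3 \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k)
 + 2\alpha_2 \sum^{\neq}_{i,j} (f(x_i,x_j))^2 \qquad (3)
\end{aligned}$$

with

$$\begin{aligned}
\alpha_4 &= \mathrm{E}\, w_1(1) w_1(2) w_1(3) w_1(4) - (\mathrm{E}\, w_1(1) w_1(2))^2, \\
\alpha_3 &= \mathrm{E}\, (w_1(1))^2 w_1(2) w_1(3) - (\mathrm{E}\, w_1(1) w_1(2))^2, \\
\alpha_2 &= \mathrm{E}\, (w_1(1) w_1(2))^2 - (\mathrm{E}\, w_1(1) w_1(2))^2,
\end{aligned}$$

where the sums run over pairwise different indices from $1$ to $n$ and the
expectations are conditional on fixed $x_1, \ldots, x_n$. All the $\alpha_i$ can be
calculated numerically and depend only on $n$ (see the Appendix). Thus the result
of the whole bootstrap procedure for $N \to \infty$ can be obtained simply by direct
computation.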
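
As a plausibility check of the statement that the coefficients depend only on $n$,
the following sketch computes $\alpha_4$, $\alpha_3$ and $\alpha_2$ exactly for small
$n$ by enumerating all $n^n$ equally likely resamples, and compares them with the
closed forms of the Appendix. The function names are illustrative, and the linear
coefficient in the numerator of $\alpha_3$ is taken to be $12n$, which is the value
that follows from the multinomial moments.

```python
from fractions import Fraction
from itertools import product


def alphas_closed_form(n: int):
    """Closed forms of (alpha_4, alpha_3, alpha_2) as rational numbers."""
    m = Fraction(n)
    a4 = (-4 * m**2 + 10 * m - 6) / m**3
    a3 = (m**3 - 7 * m**2 + 12 * m - 6) / m**3
    a2 = (3 * m**3 - 11 * m**2 + 14 * m - 6) / m**3
    return a4, a3, a2


def alphas_by_enumeration(n: int):
    """Exact (alpha_4, alpha_3, alpha_2) from all n**n resamples (n >= 4)."""
    s11 = s1111 = s211 = s22 = 0
    for draw in product(range(n), repeat=n):
        w = [0] * n
        for index in draw:
            w[index] += 1
        s11 += w[0] * w[1]
        s1111 += w[0] * w[1] * w[2] * w[3]
        s211 += w[0] ** 2 * w[1] * w[2]
        s22 += (w[0] * w[1]) ** 2
    total = n ** n
    cross = Fraction(s11, total) ** 2  # (E w(1) w(2))^2
    return (Fraction(s1111, total) - cross,
            Fraction(s211, total) - cross,
            Fraction(s22, total) - cross)


if __name__ == "__main__":
    for n in (4, 5):
        print(n, alphas_by_enumeration(n) == alphas_closed_form(n))
```

The enumeration is only feasible for very small $n$; for realistic sample sizes
the closed forms are the practical route.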

2.3 Expectation of $v_N^*$

The futility of $v_N^*$ is demonstrated by the fact that it estimates neither the
desired quantity (the variance of $\hat\theta$) nor a fixed multiple of it. To show
this, the unconditional expectation of $v_N^*$ is determined, see the Appendix.
Since the result is not very transparent, an approximation is given here which
makes it possible to characterize the quality of $v_N^*$.

Assume that the $w_k(i)$ are independent and Poisson distributed with parameter
$\mu = 1$; this simplifying assumption is exact in the limiting case $n \to \infty$,
see above. This leads to a result which is close to the exact value for large $n$
and is easy to interpret. Incidentally, the simplification is equivalent to
replacing $n$ by $n^*$ in each pseudo sample, where $n^*$ is a Poisson distributed
number with mean $\mu = n$. In this scheme each pseudo sample consists of a random
number of points.

The result is

$$\lim_{N\to\infty} \mathrm{E}\, v_N^* = \mathrm{E} \lim_{N\to\infty} v_N^* = 4 s_3 + 6 s_2, \qquad (4)$$
see the Appendix, while the desired result, given by (2), is

$$\mathrm{V}\hat\theta = s_4 + 4 s_3 + 2 s_2 - (\mathrm{E}\hat\theta)^2.$$

Remark: The formulae suggest that the bootstrap result (4) can differ considerably
from the true variance of $\hat\theta$. Nevertheless, the bootstrap procedure may
make sense. In some cases $s_4$ converges to $(\mathrm{E}\hat\theta)^2$ with growing
$W$ and $s_3$ is small compared with $s_2$. Then the true variance is approximately
$2 s_2$ while the bootstrap result (4) is approximately $6 s_2$, so the bootstrap
result may approximate three times the true variance, see [2].
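
The limiting coefficients behind (4) can be checked numerically. The sketch below
assumes, as in the simplified scheme, independent Poisson multiplicities with mean 1,
computes their first two raw moments by a truncated series, and forms the limiting
values of $\alpha_4$, $\alpha_3$ and $\alpha_2$. The function names and the
truncation at 60 terms are illustrative assumptions.

```python
import math


def poisson_raw_moment(order: int, mean: float = 1.0, terms: int = 60) -> float:
    """E[w**order] for w ~ Poisson(mean), by truncated series summation."""
    return sum(m ** order * math.exp(-mean) * mean ** m / math.factorial(m)
               for m in range(terms))


def limiting_alphas():
    """(alpha_4, alpha_3, alpha_2) for independent Poisson(1) multiplicities."""
    e1 = poisson_raw_moment(1)  # E w = 1
    e2 = poisson_raw_moment(2)  # E w^2 = 2
    cross = e1 ** 4             # (E w(1) w(2))^2 under independence
    alpha4 = e1 ** 4 - cross    # E w(1) w(2) w(3) w(4) - cross
    alpha3 = e2 * e1 ** 2 - cross
    alpha2 = e2 ** 2 - cross
    return alpha4, alpha3, alpha2


if __name__ == "__main__":
    print(limiting_alphas())
```

The values are 0, 1 and 3 up to rounding, so that
$\alpha_4 s_4 + 4\alpha_3 s_3 + 2\alpha_2 s_2$ reduces to $4 s_3 + 6 s_2$; in the
regime of the remark the ratio to the true variance is $6 s_2 / (2 s_2) = 3$.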

3 Confidence regions for the intensity function of an
inhomogeneous Poisson process

The paper [?] presents a procedure which uses bootstrap techniques to determine
confidence regions for the intensity function of an inhomogeneous Poisson process.
The confidence regions are estimated using a kernel estimator. The following
discussion outlines the main idea of that paper and shows that, as above, it is not
necessary to carry out the bootstrap procedure.

3.1 Fundamentals 

For simplicity, the following calculations are carried out for a one-dimensional
point process, but they could easily be generalized to higher-dimensional processes.

Consider an inhomogeneous Poisson point process with unknown intensity function
$\lambda(x)$ in the interval $(0,1)$, with points
$0 < x_1 < x_2 < \ldots < x_n < 1$. A kernel estimator for $\lambda(x)$ is

$$\hat\lambda(x) = \frac{1}{h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),$$

where $K$ is a kernel function and $h$ the bandwidth (see, for example, [?]).
Define

$$T(x) = \frac{\hat\lambda(x) - \mathrm{E}\hat\lambda(x)}{\sqrt{\hat\lambda(x)}}, \qquad 0 < x < 1,$$

and, for a given $\alpha$ with $0 < \alpha < 1$,

$$t_\alpha(x) = \min\{t : P\{|T(x)| \le t\} \ge 1 - \alpha\}, \qquad 0 < x < 1.$$

Then an estimate of a confidence region for $\lambda(x)$ of level $1 - \alpha$ is
the interval

$$C(x) = \left[\hat\lambda(x) - t_\alpha(x)\sqrt{\hat\lambda(x)},\ \hat\lambda(x) + t_\alpha(x)\sqrt{\hat\lambda(x)}\right], \qquad 0 < x < 1,$$

where the left border is set to 0 if it is negative.

3.2 Bootstrap versions

Since the distribution of $T$ is not available (because the intensity function is
unknown), it is approximated by simulation of pseudo data, see [?]. A set of pseudo
data is obtained by sampling randomly with replacement $n^*$ times from
$\{x_1, \ldots, x_n\}$, where $n^*$ has a Poisson distribution with mean $n$ (this is
method 2 in [?] and similar to the simplified case in Section 2.3). The number of
occurrences of $x_i$ in the $k$th sample is a random variable, denoted as above by
$w_k(i)$. All the $w_k(i)$ are independent and Poisson distributed with mean 1 for
$i = 1, \ldots, n$.

For given $\alpha$ with $0 < \alpha < 1$ the bootstrap versions of the quantities
defined above are

$$\begin{aligned}
\hat\lambda^*(x) &= \frac{1}{h} \sum_{i=1}^{n} w_k(i)\, K\!\left(\frac{x - x_i}{h}\right), \\
T^*(x) &= \frac{\hat\lambda^*(x) - \hat\lambda(x)}{\sqrt{\hat\lambda^*(x)}}, \qquad x \in (0,1), \\
t_\alpha^*(x) &= \min\{t : P^*\{|T^*(x)| \le t\} \ge 1 - \alpha\}, \\
C^*(x) &= \left[\hat\lambda(x) - t_\alpha^*(x)\sqrt{\hat\lambda(x)},\ \hat\lambda(x) + t_\alpha^*(x)\sqrt{\hat\lambda(x)}\right],
\end{aligned}$$

where $P^*(\cdot) = P(\,\cdot \mid \{x_1, \ldots, x_n\})$ is the distribution
conditional on $\{x_1, \ldots, x_n\}$.

The determination of $t_\alpha^*(x)$ can be carried out by simulation. However, a
faster and simpler approach uses the well-known fact that the sum of independent
Poisson distributed random variables is also Poisson distributed. It is demonstrated
here for the simple rectangular kernel function

$$K(x) = \tfrac{1}{2}\, 1_{[-1,1]}(x).$$

For other kernels, similar calculations are possible. Let $p(x)$ be the number of
observed points in the interval $[x - h, x + h]$. Then its bootstrap version
$p^*(x)$ is a random variable which is Poisson distributed with mean $p(x)$. Its
cumulative distribution function is denoted by $F^*$. Thus, for given $\alpha$,

$$\begin{aligned}
t_\alpha^*(x) &= \min\{t : P^*\{|T^*(x)| \le t\} \ge 1 - \alpha\} \\
&= \min\{t : P^*\{|p^*(x) - a_x(t)| \le b_x(t)\} \ge 1 - \alpha\} \\
&= \min\{t : F^*(a_x(t) + b_x(t)) - F^*(a_x(t) - b_x(t)) \ge 1 - \alpha\}, \qquad (5)
\end{aligned}$$

with

$$a_x(t) = p(x) + h t^2, \qquad b_x(t) = t \sqrt{2 h\, p(x) + h^2 t^2}.$$

The calculations are elementary, using that $\hat\lambda_k^*(x)$ is equal to
$\frac{1}{2h} \sum_{i:\,|x - x_i| \le h} w_k(i) = p^*(x)/(2h)$.
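
The simulation-free computation of $t_\alpha^*(x)$ for the rectangular kernel can be
sketched as follows. The sketch uses the fact that, for $p^*(x) = m \ge 1$, the
statistic $|T^*(x)|$ takes the value $|m - p(x)| / \sqrt{2 h m}$, which is equivalent
to the event $|p^*(x) - a_x(t)| \le b_x(t)$ above; the quantile is then read off the
sorted values. Function names and the truncation point `m_max` are illustrative
assumptions.

```python
import math


def poisson_pmf(m: int, mean: float) -> float:
    """P{p* = m} for p* ~ Poisson(mean), mean > 0, evaluated in log space."""
    return math.exp(m * math.log(mean) - mean - math.lgamma(m + 1))


def event_probability(p: int, h: float, t: float, m_max: int) -> float:
    """P*{|p*(x) - a_x(t)| <= b_x(t)} with a_x(t) and b_x(t) as in the text."""
    a = p + h * t * t
    b = t * math.sqrt(2 * h * p + h * h * t * t)
    return sum(poisson_pmf(m, p) for m in range(m_max + 1) if abs(m - a) <= b)


def bootstrap_quantile(p: int, h: float, alpha: float) -> float:
    """Smallest t with P*{|T*(x)| <= t} >= 1 - alpha, given p = p(x) >= 1."""
    if p < 1:
        raise ValueError("p(x) must be at least 1")
    m_max = int(p + 20 * math.sqrt(p) + 50)  # truncation point (assumption)
    # The outcome p*(x) = 0 never satisfies the event, so its mass exp(-p)
    # is simply left out of the accumulated probability.
    atoms = sorted((abs(m - p) / math.sqrt(2 * h * m), poisson_pmf(m, p))
                   for m in range(1, m_max + 1))
    total = 0.0
    for t, prob in atoms:
        total += prob
        if total >= 1 - alpha:
            return t
    return math.inf  # no finite t reaches the level 1 - alpha


if __name__ == "__main__":
    print(bootstrap_quantile(p=10, h=0.05, alpha=0.05))
```

If the mass $e^{-p(x)}$ at $p^*(x) = 0$ exceeds $\alpha$, no finite $t$ attains the
level, and the sketch returns infinity.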

3.3 Confidence regions without bootstrap

Of course, in the case of an inhomogeneous Poisson process it is easy to build
confidence regions without the bootstrap methodology. With the rectangular kernel,
$2h\hat\lambda(x)$ is the number of points in $[x - h, x + h]$ and is therefore
Poisson distributed with mean $\int_{x-h}^{x+h} \lambda(u)\,du$. Assume that the
intensity function $\lambda(x)$ is approximately linear in the interval
$[x - h, x + h]$. Then this mean is approximately $2h\lambda(x)$. (The rectangular
kernel could be replaced by another kernel; the corresponding calculations then
become a bit more difficult.) Therefore, known confidence regions for the Poisson
parameter can be used, see for example [?]. Thus it is easy to build the desired
confidence region for $\lambda(x)$.
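
One possible realization of this idea is sketched below. It assumes the classical
exact two-sided interval for a Poisson mean, obtained by inverting the Poisson
distribution function with bisection; this particular interval is an assumption of
the sketch and is not necessarily the one in the reference cited above. The
function names are illustrative.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P{X <= k} for X ~ Poisson(mean)."""
    if k < 0:
        return 0.0
    if mean <= 0.0:
        return 1.0
    return sum(math.exp(m * math.log(mean) - mean - math.lgamma(m + 1))
               for m in range(k + 1))


def solve_mean(k: int, target: float, upper: float) -> float:
    """Bisection for the mean at which P{X <= k} equals target.
    The distribution function is decreasing in the mean."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def intensity_interval(count: int, h: float, alpha: float):
    """Interval for lambda(x) from count = p(x), the number of points
    in [x - h, x + h], at level 1 - alpha."""
    upper_search = count + 10.0 * math.sqrt(count + 1.0) + 20.0
    mean_hi = solve_mean(count, alpha / 2, upper_search)
    mean_lo = 0.0 if count == 0 else solve_mean(count - 1, 1 - alpha / 2,
                                                upper_search)
    # The count estimates 2 h lambda(x), so rescale by 1 / (2 h).
    return mean_lo / (2 * h), mean_hi / (2 * h)


if __name__ == "__main__":
    print(intensity_interval(count=10, h=0.05, alpha=0.05))
```

No resampling is involved; the interval depends on the data only through the count
$p(x)$ and the bandwidth $h$.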

This result corresponds to a general observation: if a parametric statistical
problem is given, then parametric estimators usually lead to better results than
bootstrap techniques.

Acknowledgements: I am most grateful to Dietrich Stoyan for his great encouragement and 
for helpful discussions. 



Appendix 

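Here the derivation of some equations of Section 2 is given. Throughout,
$\sum^{\neq}$ denotes summation over tuples of pairwise different points or indices.

Equation (2)

For

$$\hat\theta = \sum^{\neq}_{x,y \in \Phi} f(x,y) = \sum^{\neq}_{x,y \in \Phi} 1_W(x)\, 1_W(y)\, h(x,y)$$

it is

$$\begin{aligned}
\mathrm{E}\hat\theta^2 &= \mathrm{E} \sum^{\neq}_{w,x,y,z \in \Phi} f(w,x)\, f(y,z)
 + 4\,\mathrm{E} \sum^{\neq}_{x,y,z \in \Phi} f(x,y)\, f(x,z)
 + 2\,\mathrm{E} \sum^{\neq}_{x,y \in \Phi} (f(x,y))^2 \\
&= \int \varrho^{(4)}(x_1,x_2,x_3,x_4)\, f(x_1,x_2)\, f(x_3,x_4)\, dx_1\, dx_2\, dx_3\, dx_4 \\
&\quad + 4 \int \varrho^{(3)}(x_1,x_2,x_3)\, f(x_1,x_2)\, f(x_1,x_3)\, dx_1\, dx_2\, dx_3 \\
&\quad + 2 \int \varrho^{(2)}(x_1,x_2)\, (f(x_1,x_2))^2\, dx_1\, dx_2 \\
&= s_4 + 4 s_3 + 2 s_2
\end{aligned}$$

with

$$s_i = \int \varrho^{(i)}(x_1, \ldots, x_i)\, f(x_1, x_2)\, f(x_{i-1}, x_i)\, dx_1 \ldots dx_i.$$

Equation (3)

For

$$\hat\theta_1^* = \sum^{\neq}_{i,j} f(x_i,x_j)\, w_1(i)\, w_1(j)$$

it is

$$\mathrm{V}\hat\theta_1^* = \mathrm{E}(\hat\theta_1^*)^2 - \mathrm{E}(\hat\theta_1^* \hat\theta_2^*)$$

with

$$\begin{aligned}
\mathrm{E}(\hat\theta_1^*)^2 &= \mathrm{E} \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l)\, w_1(i) w_1(j) w_1(k) w_1(l) \\
&\quad + 4\,\mathrm{E} \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k)\, (w_1(i))^2 w_1(j) w_1(k) \\
&\quad + 2\,\mathrm{E} \sum^{\neq}_{i,j} (f(x_i,x_j))^2\, (w_1(i) w_1(j))^2 \\
&= \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l)\, \mathrm{E}\, w_1(i) w_1(j) w_1(k) w_1(l) \\
&\quad + 4 \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k)\, \mathrm{E}\, (w_1(i))^2 w_1(j) w_1(k) \\
&\quad + 2 \sum^{\neq}_{i,j} (f(x_i,x_j))^2\, \mathrm{E}\, (w_1(i) w_1(j))^2 \\
&= \mathrm{E}\, w_1(1) w_1(2) w_1(3) w_1(4) \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l) \\
&\quad + 4\,\mathrm{E}\, (w_1(1))^2 w_1(2) w_1(3) \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k) \\
&\quad + 2\,\mathrm{E}\, (w_1(1) w_1(2))^2 \sum^{\neq}_{i,j} (f(x_i,x_j))^2
\end{aligned}$$

and

$$\begin{aligned}
\mathrm{E}(\hat\theta_1^* \hat\theta_2^*) &= \mathrm{E} \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l)\, w_1(i) w_1(j) w_2(k) w_2(l) \\
&\quad + 4\,\mathrm{E} \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k)\, w_1(i) w_1(j) w_2(i) w_2(k) \\
&\quad + 2\,\mathrm{E} \sum^{\neq}_{i,j} (f(x_i,x_j))^2\, w_1(i) w_1(j) w_2(i) w_2(j) \\
&= (\mathrm{E}\, w_1(1) w_1(2))^2 \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l) \\
&\quad + 4\, (\mathrm{E}\, w_1(1) w_1(2))^2 \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k) \\
&\quad + 2\, (\mathrm{E}\, w_1(1) w_1(2))^2 \sum^{\neq}_{i,j} (f(x_i,x_j))^2 .
\end{aligned}$$

This yields

$$\begin{aligned}
\lim_{N\to\infty} v_N^* &= \left[\mathrm{E}\, w_1(1) w_1(2) w_1(3) w_1(4) - (\mathrm{E}\, w_1(1) w_1(2))^2\right] \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l) \\
&\quad + 4 \left[\mathrm{E}\, (w_1(1))^2 w_1(2) w_1(3) - (\mathrm{E}\, w_1(1) w_1(2))^2\right] \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k) \\
&\quad + 2 \left[\mathrm{E}\, (w_1(1) w_1(2))^2 - (\mathrm{E}\, w_1(1) w_1(2))^2\right] \sum^{\neq}_{i,j} (f(x_i,x_j))^2 \\
&= \alpha_4 \sum^{\neq}_{i,j,k,l} f(x_i,x_j)\, f(x_k,x_l)
 + 4\alpha_3 \sum^{\neq}_{i,j,k} f(x_i,x_j)\, f(x_i,x_k)
 + 2\alpha_2 \sum^{\neq}_{i,j} (f(x_i,x_j))^2
\end{aligned}$$

with

$$\begin{aligned}
\alpha_4 &= (-4n^2 + 10n - 6)/n^3, \\
\alpha_3 &= (n^3 - 7n^2 + 12n - 6)/n^3, \\
\alpha_2 &= (3n^3 - 11n^2 + 14n - 6)/n^3 .
\end{aligned}$$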

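Equation (4)

The expectation of $v^* = \lim_{N\to\infty} v_N^*$ is

$$\begin{aligned}
\mathrm{E}\, v^* &= \alpha_4 \int \varrho^{(4)}(x_1,x_2,x_3,x_4)\, f(x_1,x_2)\, f(x_3,x_4)\, dx_1\, dx_2\, dx_3\, dx_4 \\
&\quad + 4\alpha_3 \int \varrho^{(3)}(x_1,x_2,x_3)\, f(x_1,x_2)\, f(x_1,x_3)\, dx_1\, dx_2\, dx_3 \\
&\quad + 2\alpha_2 \int \varrho^{(2)}(x_1,x_2)\, (f(x_1,x_2))^2\, dx_1\, dx_2 \\
&= s_4 \alpha_4 + 4 s_3 \alpha_3 + 2 s_2 \alpha_2 .
\end{aligned}$$

In the limiting case ($n \to \infty$) it is

$$\mathrm{E}\, v^* = 4 s_3 + 6 s_2$$

(see Equation (4)).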